CANDS: A Computational Implementation of Collins and Stabler (2016)
Syntacticians must keep track of the empirical coverage and the inner workings of syntactic theories, a task that is especially demanding for minimalist syntacticians to perform manually and mentally. We believe that the computational implementation of syntactic theories is desirable in that it not only (a) facilitates the evaluation of their empirical coverage, but also (b) forces syntacticians to specify their inner workings. In this paper, we present CANDS, a computational implementation of Collins AND Stabler (2016) in the programming language Rust. Specifically, CANDS consists of one main library, cands, and two wrapper programs for cands, derivck and derivexp. The main library, cands, implements key definitions of fundamental concepts in minimalist syntax from Collins and Stabler (2016), which can be employed to evaluate and extend specific syntactic theories. The wrapper programs, derivck and derivexp, allow syntacticians to check and explore syntactic derivations through an accessible interface.
Modeling Morphological Processing in Human Magnetoencephalography
In this paper, we conduct a magnetoencephalography (MEG) lexical decision experiment and computationally model morphological processing in the human brain, especially in the Visual Word Form Area (VWFA) in the visual ventral stream. Five neurocomputational models of morphological processing are constructed and evaluated against human neural activity: the Character Markov Model and the Syllable Markov Model as amorphous models without morpheme units, and the Morpheme Markov Model, Hidden Markov Model (HMM), and Probabilistic Context-Free Grammar (PCFG) as morphous models with morpheme units structured linearly or hierarchically. Our MEG experiment and computational modeling demonstrate that the morphous models outperformed the amorphous models, that the PCFG was the most neurologically accurate among the morphous models, and that the PCFG better explained nested words with non-local dependencies between prefixes and suffixes. These results strongly suggest that morphemes are represented in the human brain and parsed into hierarchical morphological structures.
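To make the amorphous baseline concrete, the following is a minimal sketch (not the authors' implementation) of a character-bigram Markov model that assigns surprisal to written words without any notion of morphemes; the toy lexicon and add-one smoothing are illustrative assumptions, and the morphous models differ only in defining their units over morphemes or hierarchical structures.

```python
import math
from collections import Counter, defaultdict

# A minimal character-bigram ("amorphous") model: it knows nothing about
# morphemes, only about transitions between adjacent characters.
class CharMarkovModel:
    def __init__(self, words):
        self.bigrams = defaultdict(Counter)
        self.vocab = set("#")  # '#' marks word boundaries
        for w in words:
            chars = ["#"] + list(w) + ["#"]
            self.vocab.update(chars)
            for prev, nxt in zip(chars, chars[1:]):
                self.bigrams[prev][nxt] += 1

    def prob(self, prev, nxt):
        # Add-one (Laplace) smoothing so unseen transitions get nonzero mass.
        counts = self.bigrams[prev]
        return (counts[nxt] + 1) / (sum(counts.values()) + len(self.vocab))

    def surprisal(self, word):
        # Total surprisal in bits: -log2 P(word) under the bigram model.
        chars = ["#"] + list(word) + ["#"]
        return -sum(math.log2(self.prob(p, n)) for p, n in zip(chars, chars[1:]))

# Toy training lexicon standing in for real corpus data (hypothetical).
model = CharMarkovModel(["unlockable", "unhappy", "lockable", "happily"])
print(model.surprisal("unlockable"))  # lower = more predictable to the model
```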
Learning Argument Structures with Recurrent Neural Network Grammars
In targeted syntactic evaluations, the syntactic competence of language models (LMs) has been investigated through various syntactic phenomena, among which argument structure has been one of the important domains. However, the previous literature has exclusively tested argument structures in head-initial languages, which may be readily predicted from the lexical information of verbs, potentially overestimating the syntactic competence of LMs. In this paper, we explore whether argument structures can be learned by LMs in head-final languages, which could be more challenging given that argument structures must be predicted before verbs are encountered during incremental sentence processing, so that syntactic information should carry more weight than lexical information. Specifically, we examined the double accusative constraint and the double dative constraint in Japanese with sequential and hierarchical LMs: an n-gram model, an LSTM, GPT-2, and an RNNG. Our results demonstrated that the double accusative constraint is captured by all LMs, whereas the double dative constraint is successfully explained only by the hierarchical model. In addition, we probed incremental sentence processing by LMs through the lens of surprisal, and suggested that the hierarchical model may capture the deep semantic roles that verbs assign to arguments, while the sequential models seem to be influenced by surface case alignments.
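As an illustration of the surprisal-based probing described above, here is a rough sketch (not the paper's code) of comparing a minimal pair by total surprisal under a causal LM via the Hugging Face transformers library; the English "gpt2" checkpoint and the example sentences are placeholders standing in for the Japanese models and double-case stimuli actually used.

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# Stand-in model; the paper's Japanese LMs (n-gram, LSTM, GPT-2, RNNG) would
# be substituted here. "gpt2" is used only to illustrate the procedure.
tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def total_surprisal(sentence):
    # Sum of -log2 p(token | preceding tokens) over the sentence.
    ids = tokenizer(sentence, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(ids).logits
    log_probs = torch.log_softmax(logits[0, :-1], dim=-1)
    token_lp = log_probs[torch.arange(ids.shape[1] - 1), ids[0, 1:]]
    return -(token_lp / torch.log(torch.tensor(2.0))).sum().item()

# A minimal pair (hypothetical English sentences): the violating member should
# receive higher surprisal if the model has internalized the constraint.
ok = "The chef gave the guest a dessert."
bad = "The chef gave the guest the critic a dessert."
print(total_surprisal(ok), total_surprisal(bad))
```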
Wh-Concord in Okinawan = Syntactic Movement + Morphological Merger
The main purpose of this paper is to provide a novel account of Wh-Concord in Okinawan based on the Copy Theory of Movement and Distributed Morphology. We propose that Wh-Concord interrogatives and Japanese-type wh-interrogatives have exactly the same derivation in the syntactic component: the Q-particle -ga, base-generated as adjoined to a wh-phrase, undergoes movement to the clause-final position. The two types of interrogatives are distinguished in the post-syntactic component: only in Wh-Concord does the -r morpheme on C0 trigger Morphological Merger, which makes it possible to Spell-Out the lower copy of -ga. It is shown that the proposed analysis correctly predicts three descriptive generalizations about the distribution of -ga in (i) syntactic islands, (ii) subordinate clauses, and (iii) (embedded) multiple wh-interrogatives.
Composition, Attention, or Both?
In this paper, we propose a novel architecture called Composition Attention Grammars (CAGs) that recursively composes subtrees into a single vector representation with a composition function, and selectively attends to previous structural information with a self-attention mechanism. We investigate whether these components, the composition function and the self-attention mechanism, can both induce human-like syntactic generalization. Specifically, we train language models (LMs) with and without these two components with the model sizes carefully controlled, and evaluate their syntactic generalization performance against six test circuits on the SyntaxGym benchmark. The results demonstrated that the composition function and the self-attention mechanism both play an important role in making LMs more human-like, and closer inspection of linguistic phenomena implied that the composition function allowed syntactic features, but not semantic features, to percolate into subtree representations.
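The composition function can be pictured with a small sketch; the following is a generic, hypothetical RNNG-style composer (dimensions, pooling, and layer choices are assumptions, not the CAG implementation) that squashes the child vectors of a completed subtree into a single vector, which is where syntactic features could percolate into subtree representations.

```python
import torch
import torch.nn as nn

# A generic composition function in the spirit of RNNG-style composition:
# the vectors of a completed subtree's children are composed into a single
# vector with a bidirectional LSTM, so features of the children can
# percolate into the subtree representation. Dimensions are hypothetical.
class SubtreeComposer(nn.Module):
    def __init__(self, dim=256):
        super().__init__()
        self.bilstm = nn.LSTM(dim, dim, bidirectional=True, batch_first=True)
        self.proj = nn.Linear(2 * dim, dim)

    def forward(self, child_vectors):
        # child_vectors: (num_children, dim) for one completed constituent.
        outputs, _ = self.bilstm(child_vectors.unsqueeze(0))
        # Pool over children and project back to the model dimension.
        pooled = outputs.mean(dim=1)
        return torch.tanh(self.proj(pooled)).squeeze(0)

composer = SubtreeComposer()
children = torch.randn(3, 256)          # e.g., vectors for [Det, Adj, N]
subtree_vector = composer(children)     # single vector for the NP subtree
print(subtree_vector.shape)             # torch.Size([256])
```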
JCoLA: Japanese Corpus of Linguistic Acceptability
Neural language models have exhibited outstanding performance in a range of downstream tasks. However, there is limited understanding of the extent to which these models internalize syntactic knowledge, and various datasets have therefore been constructed recently to facilitate the syntactic evaluation of language models across languages. In this paper, we introduce JCoLA (Japanese Corpus of Linguistic Acceptability), which consists of 10,020 sentences annotated with binary acceptability judgments. Specifically, those sentences are manually extracted from linguistics textbooks, handbooks, and journal articles, and split into in-domain data (86%; relatively simple acceptability judgments extracted from textbooks and handbooks) and out-of-domain data (14%; theoretically significant acceptability judgments extracted from journal articles), the latter of which is categorized by 12 linguistic phenomena. We then evaluate the syntactic knowledge of 9 different types of Japanese language models on JCoLA. The results demonstrated that several models could surpass human performance on the in-domain data, while no models were able to exceed human performance on the out-of-domain data. Error analyses by linguistic phenomena further revealed that although neural language models are adept at handling local syntactic dependencies like argument structure, their performance wanes when confronted with long-distance syntactic dependencies like verbal agreement and NPI licensing.
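For a sense of how such an evaluation is typically scored, the snippet below, a hedged sketch rather than the official JCoLA protocol, computes accuracy and Matthews correlation (a standard metric for CoLA-style acceptability benchmarks) from hypothetical model predictions.

```python
from sklearn.metrics import accuracy_score, matthews_corrcoef

# Hypothetical model outputs on a handful of JCoLA-style items:
# 1 = acceptable, 0 = unacceptable. In practice these predictions would come
# from a fine-tuned or zero-shot Japanese language model.
gold        = [1, 1, 0, 0, 1, 0, 1, 0]
predictions = [1, 1, 0, 1, 1, 0, 0, 0]

# Accuracy alone can be misleading on imbalanced acceptability data, so
# CoLA-style benchmarks commonly also report Matthews correlation (MCC).
print("accuracy:", accuracy_score(gold, predictions))
print("MCC:     ", matthews_corrcoef(gold, predictions))
```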
Psychometric Predictive Power of Large Language Models
Next-word probabilities from language models have been shown to successfully simulate human reading behavior. Building on this, we show that, interestingly, instruction-tuned large language models (LLMs) yield worse psychometric predictive power (PPP) for human reading behavior than base LLMs with equivalent perplexities. In other words, instruction tuning, which helps LLMs provide human-preferred responses, does not always make them more human-like from the perspective of computational psycholinguistics. In addition, we explore prompting methodologies for simulating human reading behavior with LLMs, showing that prompts reflecting a particular linguistic hypothesis lead LLMs to exhibit better PPP, but still worse than that of base LLMs. These findings highlight that recent instruction tuning and prompting do not offer better estimates than direct probability measurements from base LLMs in cognitive modeling.
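PPP is commonly operationalized as the gain in a regression model's fit to reading times when LM surprisal is added to baseline predictors; the sketch below illustrates that idea on synthetic data (the predictors, coefficients, and simple OLS setup are illustrative assumptions, not the paper's protocol).

```python
import numpy as np
import statsmodels.api as sm

# Synthetic stand-in data: per-word reading times plus baseline predictors
# (word length, log frequency) and an LM-derived surprisal estimate.
rng = np.random.default_rng(0)
n = 500
length    = rng.integers(2, 12, n).astype(float)
log_freq  = rng.normal(10, 2, n)
surprisal = rng.gamma(2.0, 3.0, n)
reading_time = 200 + 8 * length - 5 * log_freq + 12 * surprisal + rng.normal(0, 30, n)

def loglik(X, y):
    # Log-likelihood of an ordinary least squares fit.
    return sm.OLS(y, sm.add_constant(X)).fit().llf

baseline = np.column_stack([length, log_freq])
full     = np.column_stack([length, log_freq, surprisal])

# PPP as commonly operationalized: per-word gain in log-likelihood when
# surprisal is added on top of the baseline predictors.
delta_ll = (loglik(full, reading_time) - loglik(baseline, reading_time)) / n
print(f"delta log-likelihood per word: {delta_ll:.4f}")
```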
Cross-linguistic patterns of morpheme order reflect cognitive biases: An experimental study of case and number morphology
A foundational goal of linguistics is to investigate whether shared features of the human cognitive system can explain how linguistic patterns are distributed across languages. In this paper we report a series of artificial language learning experiments which aim to test a hypothesised link between cognition and a persistent regularity of morpheme order: number morphemes (e.g., plural markers) tend to be ordered closer to noun stems than case morphemes (e.g., accusative markers) (Universal 39; Greenberg, 1963). We argue that this typological tendency may be driven by learners' bias towards orders that reflect scopal relationships in morphosyntactic and semantic composition (Bybee, 1985; Rice, 2000; Culbertson & Adger, 2014). This bias is borne out by our experimental results: learners, in the absence of any evidence on how to order number and case morphology, consistently produce number closer to the noun stem. We replicate this effect across two populations (English and Japanese speakers). We also find that it holds independent of morpheme position (prefixal or suffixal), degree of boundedness (free or bound morphology), frequency, and which particular case/number feature values are instantiated in the overt markers (accusative or nominative, plural or singulative). However, we show that this tendency can be reversed when the form of the case marker is made highly dependent on the noun stem, suggesting an influence of an additional bias for local dependencies. Our results provide evidence that universal features of cognition may play a causal role in shaping the relative order of morphemes.
Context Limitations Make Neural Language Models More Human-Like
Language models (LMs) have been used in cognitive modeling as well as in engineering studies: they compute information-theoretic complexity metrics that simulate humans' cognitive load during reading. This study highlights a limitation of modern neural LMs as the model of choice for this purpose: there is a discrepancy between their context access capacities and those of humans. Our results showed that constraining the LMs' context access improved their simulation of human reading behavior. We also showed that LM-human gaps in context access were associated with specific syntactic constructions; incorporating syntactic biases into LMs' context access might enhance their cognitive plausibility.
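The core manipulation, restricting how much left context the LM may condition on when computing each word's surprisal, can be sketched as follows; the "gpt2" checkpoint, the window size, and the garden-path example are placeholders rather than the paper's actual models or stimuli.

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("gpt2")   # placeholder model
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def surprisals_with_limited_context(text, max_context=5):
    # For each token, condition only on (at most) the last `max_context`
    # tokens instead of the full left context, mimicking constrained
    # context access during incremental reading.
    ids = tokenizer(text, return_tensors="pt").input_ids[0]
    surprisals = []
    for i in range(1, len(ids)):
        context = ids[max(0, i - max_context):i].unsqueeze(0)
        with torch.no_grad():
            logits = model(context).logits[0, -1]
        log_probs = torch.log_softmax(logits, dim=-1)
        # Convert nats to bits for the current token's surprisal.
        surprisals.append(-log_probs[ids[i]].item() / torch.log(torch.tensor(2.0)).item())
    return surprisals

print(surprisals_with_limited_context("The horse raced past the barn fell."))
```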
Design of BCCWJ-EEG: Balanced Corpus with Human Electroencephalography
The past decade has witnessed the happy marriage between natural language processing (NLP) and the cognitive science of language. Moreover, given the historical relationship between biological and artificial neural networks, the advent of deep learning has re-sparked strong interest in the fusion of NLP and the neuroscience of language. Importantly, this cross-fertilization between NLP, on the one hand, and the cognitive (neuro)science of language, on the other, has been driven by language resources annotated with human language processing data. However, those language resources still have several limitations with respect to annotations, genres, languages, etc. In this paper, we describe the design of a novel language resource called BCCWJ-EEG, the Balanced Corpus of Contemporary Written Japanese (BCCWJ) experimentally annotated with human electroencephalography (EEG). Specifically, after extensively reviewing the language resources currently available in the literature, with special focus on eye-tracking and EEG, we summarize the details concerning (i) participants, (ii) stimuli, (iii) procedure, (iv) data preprocessing, (v) corpus evaluation, (vi) resource release, and (vii) compilation schedule. In addition, potential applications of BCCWJ-EEG to neuroscience and NLP will also be discussed.